Smooth soft mel-spectrographic masks based on blind sparse source separation
Abstract
This paper investigates the use of DUET, a recently proposed blind source separation method, as a front-end for missing-data speech recognition. Based on attenuation and delay estimation in stereo signals, soft time-frequency masks are designed to extract a target speaker from a mixture containing multiple speech sources. A postprocessing step is introduced to remove isolated mask points that can cause insertion errors in the speech decoder. Results for connected-digit experiments in a multi-speaker environment demonstrate that the proposed soft masks closely match the performance of the oracle mask designed with a priori knowledge of the source spectra.
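The pipeline summarized in the abstract (per-bin attenuation and delay estimates, a soft mask per source, then removal of isolated mask points) can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the Gaussian weighting, the widths `sigma_a` and `sigma_d`, the 0.5 activity threshold, the 8-neighbour isolation rule, and all function names are assumptions chosen for illustration. The mel-scale mapping named in the title is omitted.

```python
import numpy as np


def duet_features(X1, X2, omega, eps=1e-12):
    """Relative attenuation and delay per time-frequency bin.

    X1, X2: complex STFTs of the two channels, shape (freq, time).
    omega:  angular frequency of each row in rad/sample, all nonzero
            (the caller is assumed to have dropped the DC bin).
    """
    ratio = (X2 + eps) / (X1 + eps)
    alpha = np.abs(ratio)                       # relative attenuation
    delta = -np.angle(ratio) / omega[:, None]   # relative delay in samples
    return alpha, delta


def soft_masks(alpha, delta, centers, sigma_a=0.2, sigma_d=0.5):
    """One soft mask per source, normalized to sum to 1 in every bin.

    centers: list of (attenuation, delay) pairs, one per source.
    The Gaussian weighting is an illustrative assumption.
    """
    scores = []
    for a_k, d_k in centers:
        dist = ((alpha - a_k) / sigma_a) ** 2 + ((delta - d_k) / sigma_d) ** 2
        scores.append(np.exp(-0.5 * dist))
    scores = np.stack(scores)
    return scores / (scores.sum(axis=0, keepdims=True) + 1e-12)


def remove_isolated(mask, threshold=0.5):
    """Zero mask points above `threshold` that have no active 8-neighbour."""
    active = mask > threshold
    rows, cols = active.shape
    padded = np.pad(active, 1).astype(int)      # zero border around the mask
    neighbours = sum(
        padded[1 + di:1 + di + rows, 1 + dj:1 + dj + cols]
        for di in (-1, 0, 1)
        for dj in (-1, 0, 1)
        if (di, dj) != (0, 0)
    )
    cleaned = mask.copy()
    cleaned[active & (neighbours == 0)] = 0.0
    return cleaned
```

In this sketch the mask for the target speaker would be `soft_masks(...)[k]` for the target index `k`, passed through `remove_isolated` before being handed to a missing-data recognizer.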
Similar resources
Blind Source Separation using Relative Newton Method combined with Smoothing Method of Multipliers
We study a relative optimization framework for quasi-maximum likelihood blind source separation and relative Newton method as its particular instance. The structure of the Hessian allows its fast approximate inversion. In the second part we present Smoothing Method of Multipliers (SMOM) for minimization of sum of pairwise maxima of smooth functions, in particular sum of absolute value terms. In...
7 Relative Newton and Smoothing Multiplier Optimization Methods for Blind Source Separation
We study a relative optimization framework for quasi-maximum likelihood blind source separation and relative Newton method as its particular instance. The structure of the Hessian allows its fast approximate inversion. In the second part we present Smoothing Method of Multipliers (SMOM) for minimization of sum of pairwise maxima of smooth functions, in particular sum of absolute value terms. In...
Blind Source Separation with Relative Newton Method
We study a relative optimization framework for quasi-maximum likelihood blind source separation and relative Newton method as its particular instance. Convergence of the Newton method is stabilized by the line search and by the modification of the Hessian, which forces its positive definiteness. The structure of the Hessian allows fast approximate inversion. We demonstrate the efficiency of ...
Combined Multi-Channel NMF-Based Robust Beamforming for Noisy Speech Recognition
We propose a novel acoustic beamforming method using blind source separation (BSS) techniques based on non-negative matrix factorization (NMF). In conventional mask-based approaches, hard or soft masks are estimated and beamforming is performed using speech and noise spatial covariance matrices calculated from masked noisy observations, but the phase information of the target speech is not adeq...
Séparation de sources par lissage cepstral des masques binaires (Source separation by cepstral smoothing of binary masks) [in French]
In this paper, we propose a system for separating speech signals from two convolutive mixtures. The suggested system combines a blind source separation technique with a time-frequency masking procedure, followed by cepstral smoothing. Indeed, after separation of signal sources, the estimated binary masks undergo a ceps...
Publication date: 2007